This report is for ETC5521 Assignment 1 by Team Lorikeet, comprising Aryan Jain, Emily Sheehan, Jimmy Effendy, and Diyao Chen.

1 Introduction and Motivation

Measles is a highly infectious disease caused by the measles virus. It can lead to pneumonia, infections of the middle ear, swelling of the brain and death.

As there is no specific treatment for measles, prevention relies on vaccination. The vaccine contains an attenuated (weakened) form of the measles virus that stimulates the production of antibodies and memory cells, providing long-term protection against the virus. When administered within 72 hours of exposure, the vaccine has been found to be 90.5% effective (Barrabeig et al., 2011).

Unfortunately, a growing number of individuals are refusing vaccination, particularly in the US (Phadke et al., 2016). In Texas, the number of unvaccinated children obtaining exemptions to attend school has increased 28-fold since 2003 (Sinclair et al., 2019). This has led to several outbreaks of vaccine-preventable diseases, such as measles. If this trend continues, the consequences could be severe.

Understanding the drivers of MMR vaccination rates is therefore imperative. This paper aims to determine whether there is a relationship between socioeconomic status and the MMR vaccination rate. It explores how MMR vaccination rates vary across school types, states, income levels, enrollment numbers, educational attainment levels, and proportions of foreign-born population. It also compares MMR vaccination rates against overall vaccination rates.

Specifically, this paper aims to answer the following questions:

  • Primary Question:

    • Does the measles vaccination rate improve with better socioeconomic conditions?
  • Secondary Questions:

    • Are the MMR vaccination rates higher in private schools?
    • How does a school’s MMR vaccination rate compare to the school’s overall vaccination rate?
    • Which states have the lowest vaccination rates?
    • Does higher income per capita lead to higher MMR vaccination rates?
    • Are MMR vaccination rates lower in areas with a higher proportion of foreign-born population?
    • Do regions with better educational attainment have higher MMR vaccination rates?

First, the paper will discuss the data used and how it is prepared for the analysis. Then, analysis and findings about the research questions will be presented and discussed.

2 Data Description

To analyse this relationship, a dataset of vaccination rates for 46,412 schools in 32 U.S. states was retrieved from The Wall Street Journal (WSJ). The variables include the school academic year, the school’s state, city, county, district, name, type, enrollment, MMR (measles, mumps and rubella) vaccination rate, overall vaccination rate, latitude, longitude and the percentage of students exempted from vaccinations for personal, religious or medical reasons. The state health departments provided the vaccination data and the National Center for Education Statistics provided the school locations, which were matched against the school names. Where there was no match, the school’s location was found with the Google Maps API.

Additional 2018 county-level data on income per capita, educational attainment level, and the number of foreign-born residents was retrieved from the U.S. Census Bureau (United States Census Bureau 2018). This was done using the Census data API provided by the Census Bureau together with the tidycensus package.

2.1 Data Limitation

One limitation of the WSJ measles data is the inconsistency in data collection. The data was collected in the 2017-18 school year for 11 states, but in the 2018-19 school year for the remaining 21 states. Moreover, inspection with the naniar package shows that the dataset has a considerable number of missing values. Although every precaution has been taken to ensure that figures are calculated accurately, some of the MMR rates, overall vaccination rates and school types are missing from the original dataset. The following variables are largely unusable due to their high number of missing values:

  • xrel: the percentage of students exempted from vaccinations due to religious reasons
  • xmed: the percentage of students exempted from vaccinations due to medical reasons
  • xper: the percentage of students exempted from vaccinations due to personal reasons
  • district: school district
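
The kind of missing-value check described above can be illustrated with a short sketch. It is written in Python rather than with the naniar package used for this report, and the rows, column subset and values are invented for illustration; missing entries are assumed to be represented as None.

```python
# Illustrative rows only; the values are invented, not taken from the WSJ data.
rows = [
    {"name": "School A", "mmr": 97.0, "overall": 95.0, "xmed": None, "district": None},
    {"name": "School B", "mmr": None, "overall": 91.5, "xmed": 1.2, "district": None},
    {"name": "School C", "mmr": 88.0, "overall": None, "xmed": None, "district": None},
    {"name": "School D", "mmr": 99.0, "overall": 98.0, "xmed": None, "district": "D1"},
]


def missing_share(records):
    """Return the fraction of records in which each column is None."""
    columns = records[0].keys()
    total = len(records)
    return {
        col: sum(1 for r in records if r[col] is None) / total
        for col in columns
    }


shares = missing_share(rows)

# List the columns from most to least incomplete.
for col, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{col}: {share:.0%} missing")
```

Columns whose share of missing values is close to 1 are the ones that would be set aside, as was done for the exemption and district variables.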

2.2 Data Cleaning and Transformation

The individual state dataset was scraped from the Tidy Tuesday GitHub repository and combined with the existing measles dataset using left_join to extract the longitude and latitude variables. Functions from the rvest package, including read_html and html_table, were used to scrape the data.

A considerable amount of data wrangling was needed for the U.S. census dataset, as it does not provide descriptions of what each variable represents (e.g. variable B19301_001 represents income per capita). In addition, a county_state variable, combining county and state, was added to both the measles and U.S. census datasets. This variable is used as the key to merge the two datasets. The wrangling was done with the tidyverse and janitor packages.
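
The key-based merge can be sketched as follows. This is a minimal Python illustration, not the tidyverse code used for the report: the rows, the income figures, the helper names make_key and left_join, and the choice to normalise case and whitespace when building the key are all assumptions made for the example.

```python
def make_key(county, state):
    """Build a 'county_state' key, normalising case and surrounding spaces."""
    return f"{county.strip().lower()}_{state.strip().lower()}"


# Invented example rows; the column names are assumptions for illustration.
schools = [
    {"name": "School A", "county": "King", "state": "Washington", "mmr": 91.0},
    {"name": "School B", "county": "Travis ", "state": "Texas", "mmr": 94.5},
    {"name": "School C", "county": "Unknown", "state": "Texas", "mmr": 96.0},
]
census = [
    {"county": "king", "state": "washington", "income_per_capita": 50000},
    {"county": "travis", "state": "texas", "income_per_capita": 40000},
]

# Index the census rows by key so each school can look up its county.
census_by_key = {make_key(c["county"], c["state"]): c for c in census}


def left_join(school_rows, lookup):
    """Keep every school; attach income where the key matches, else None."""
    joined = []
    for s in school_rows:
        key = make_key(s["county"], s["state"])
        match = lookup.get(key)
        income = match["income_per_capita"] if match else None
        joined.append({**s, "county_state": key, "income_per_capita": income})
    return joined


merged = left_join(schools, census_by_key)
```

Keeping unmatched schools with a missing income, rather than dropping them, makes it easy to see how many schools could not be linked to a county in the census data.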

3 Analysis and Findings

3.1 Are the MMR Vaccination Rates Higher in Private Schools?

Figure 3.1: Box plot of School’s MMR Vaccination Rates by School Types

This section compares private schools’ MMR vaccination rates with those of other school types. It can be argued that school type serves as a proxy for socioeconomic characteristics: tuition fees for private schools are generally higher than for public schools (Kerr 2019). This is partly because public schools receive government funding while private schools are privately funded.

The distribution of MMR vaccination rates across school types is shown in Figure 3.1, with school type on the y-axis and MMR vaccination rate on the x-axis. As the plot shows, the distributions are skewed to the left. This means that the school types have a considerable number of outliers whose values are small compared to the rest of the observations. In addition, the distributions for some school types, such as private, non-public, and charter schools, are multimodal.

Table 3.1: Comparison of MMR Average Vaccination Rates according to School Type
Type            Average MMR Vaccination Rate (%)
BOCES           98.75
Public          96.16
Nonpublic       94.38
Kindergarten    94.21
Private         93.32
Charter         87.96

Table 3.1 shows the 2018/2019 average MMR vaccination rates across school types in the USA. Boards of Cooperative Educational Services (BOCES) and public schools have the highest average MMR vaccination rates. In contrast, private schools have the second lowest average rate, ahead of only charter schools. This is consistent with Shaw et al. (2014), who found that private schools have higher rates of exemption from immunisation requirements than public schools.

3.2 How Does School’s MMR Vaccination Rate Compare to the School’s Overall Vaccination Rate?

Figure 3.2: Density plot of Differences in School’s MMR and Overall Vaccination Rates

This section compares schools’ 2018/2019 MMR and overall vaccination rates in the USA. The distribution of the differences between the two rates is shown in the density plots in Figure 3.2. As in the previous section, these distributions have a fair number of outliers. Kindergartens have the most dispersed distribution, while private schools have the least. The distribution of the differences for public schools is strongly multimodal.

Table 3.2: School’s MMR Vaccination Rate and Overall Vaccination Rate Comparison
School Type     MMR Vaccination Rate (%)   Overall Vaccination Rate (%)   Rate Difference (percentage points)
Kindergarten    94.20                      87.99                          6.21
Private         93.16                      91.37                          1.78
Public          95.90                      94.51                          1.38

Table 3.2 summarises the comparison. Unlike the previous section, the table covers only three school types, because only these three have observations for both the MMR and overall vaccination rates in the WSJ dataset. Table 3.2 shows that kindergartens have a 6.21 percentage point difference between MMR and overall vaccination rates. In contrast, private and public schools have similar MMR and overall vaccination rates, differing by less than 2 percentage points.

3.3 Which States Have the Lowest Vaccination Rates?

Figure 3.3: Bar Chart of the Proportion of Schools with Less than 95% Vaccination Rates

This section examines schools with low vaccination rates across states. In particular, it explores the proportion of schools in each U.S. state with MMR vaccination rates below 95%. According to the California Department of Public Health, an MMR vaccination rate of at least 95% is needed to prevent community transmission of the disease (Lambert and Willis 2019).

Figure 3.3 shows the proportion of schools with low vaccination rates in a bar chart. In 11 states, this proportion is lower than the average across states. Arkansas, however, has a worryingly high proportion of schools with low vaccination rates at 99.65%: only 2 of its 567 schools reach the 95% threshold.

Figure 3.4: Map of Schools with MMR Vaccination Rates Less than 95%

The measles data was grouped by state, and the proportion of schools with an MMR vaccination rate below 95% was calculated for each state. The map_data function was then used to create a tibble containing the geographical information of each state. This was merged with the measles_states data, which contains the proportion for each state. Missing data and negative values were removed, and the remaining data was plotted as a map and a bar chart using geom_polygon and geom_col, respectively. The ggplotly function was used to make the maps interactive.
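
The per-state proportion can be sketched as below. This is an illustrative Python version of the grouping step, not the R code behind the figures; the records are invented, the function name share_below_threshold is hypothetical, and the order of operations (dropping missing or negative rates before counting) is an assumption.

```python
from collections import defaultdict

THRESHOLD = 95.0  # MMR vaccination rate threshold cited in this section


def share_below_threshold(records, threshold=THRESHOLD):
    """Per state, the proportion of schools whose MMR rate is below threshold.

    Records with a missing (None) or negative rate are skipped.
    """
    below = defaultdict(int)
    total = defaultdict(int)
    for state, rate in records:
        if rate is None or rate < 0:
            continue
        total[state] += 1
        if rate < threshold:
            below[state] += 1
    return {state: below[state] / total[state] for state in total}


# Invented rows of (state, MMR vaccination rate in %).
records = [
    ("Arkansas", 80.0), ("Arkansas", 97.0), ("Arkansas", 90.0), ("Arkansas", -1),
    ("Vermont", 96.0), ("Vermont", 94.0), ("Vermont", None),
]
shares = share_below_threshold(records)

# The Arkansas figure reported in this section follows the same arithmetic:
# 565 of 567 schools fall below the threshold.
arkansas_reported = round(565 / 567 * 100, 2)
```

With the invented rows, two of the three valid Arkansas records and one of the two valid Vermont records fall below the threshold.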

The proportion of schools with MMR vaccination rates below 95% is shown on a map in Figure 3.4. States outside the scope of the WSJ dataset are filled in grey. California and most of the Northeast region of the U.S. have a relatively low proportion of schools below the 95% threshold. Beyond this, there appears to be no strong association between low MMR vaccination rates and geography: the proportion of schools with low vaccination rates is scattered across the country without a clear pattern.

3.4 Does Higher Income Per Capita Lead to Higher MMR Vaccination Rates?

To analyse the average income of the states with the highest and lowest vaccination rates, an external dataset from the U.S. Census Bureau was retrieved. This data was merged with the measles data grouped by state, and the top and bottom five states by vaccination rate were tabulated. Finally, an income quantile was determined for each state: states in quantile 1 have the lowest income per capita, and those in quantile 4 have the highest.
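
One way to assign such income groups is sketched below in Python. The state labels and incomes are invented, and the cut-point method (the default of statistics.quantiles) is an assumption; the report's R code may have grouped states differently.

```python
import bisect
import statistics

# Invented per capita incomes for eight hypothetical states.
incomes = {
    "A": 20000, "B": 24000, "C": 28000, "D": 32000,
    "E": 36000, "F": 40000, "G": 44000, "H": 48000,
}

# Three cut points split the incomes into four groups.
cuts = statistics.quantiles(incomes.values(), n=4)

# Group 1 holds the lowest incomes and group 4 the highest.
income_group = {
    state: bisect.bisect_right(cuts, income) + 1
    for state, income in incomes.items()
}

for state, group in income_group.items():
    print(state, incomes[state], group)
```

Because the groups are relative to the states in the data, a state's group can change if the set of states changes, which matters when only 32 states are covered.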

Table 3.3: The Per Capita Income of the States with the Highest MMR Vaccination Rate
States          MMR Vaccination Rate (%)   Per Capita Income   Income Quantiles
Illinois        97.62                      $28,105.72          2
Connecticut     96.49                      $41,021.25          4
Massachusetts   96.26                      $40,222.43          4
South Dakota    95.25                      $28,617.14          2
Vermont         95.19                      $31,966.00          4

Table 3.4: The Per Capita Income of the States with the Lowest MMR Vaccination Rate
States          MMR Vaccination Rate (%)   Per Capita Income   Income Quantiles
Washington      88.14                      $29,274.29          3
Minnesota       90.89                      $30,827.47          3
Arizona         91.38                      $23,459.40          1
Maine           92.24                      $28,983.25          2
Texas           92.32                      $27,504.22          1

Table 3.3 shows the five states with the highest MMR vaccination rates, along with their income per capita. These states have varying levels of income per capita, ranging from USD 28,105 to USD 41,021. The table also shows that the states with the highest MMR vaccination rates fall in the medium and high income groups (quantiles 2 and 4).

Table 3.4, on the other hand, shows the five states with the lowest MMR vaccination rates. As in the previous table, these states have varying levels of income per capita. None of them, however, falls in the highest income quantile.

These two tables suggest that income per capita is not a good indicator of MMR vaccination rates.

Figure 3.5: Scatter plot of School’s MMR Vaccination Rates by Income Per Capita

Figure 3.6: Box plot of School’s MMR Vaccination Rates by Income Quantiles

Figure 3.5 shows that the effect of income per capita on MMR vaccination rates varies across states. While a linear association is apparent in some states, the relationship between the two variables is difficult to ascertain in most.

The distribution of MMR vaccination rates across the income quantiles is plotted in Figure 3.6. The highest MMR vaccination rates occur in schools in the lowest income quantile. The figure suggests that higher income per capita does not lead to higher MMR vaccination rates.

3.5 Are MMR Vaccination Rates Lower in Areas with Higher Proportion of Foreign Born Population?

Figure 3.7: Lollipop plot of Foreign Born Population Proportion by States

This section analyses whether MMR vaccination rates are lower in regions where the proportion of foreign-born population is high. The proportion of foreign-born residents in each state’s total population is shown in Figure 3.7. New Jersey has the highest foreign-born proportion at 0.17. In contrast, West Virginia has the lowest, with only 0.01 of its total population being foreign-born.

Figure 3.8: Scatter plot of Foreign Born Population Proportion by Vaccination Rates

To determine whether there is an association between MMR vaccination rates and the proportion of foreign-born population, a scatter plot is used. This is shown in Figure 3.8, where MMR vaccination rates are plotted on the y-axis and the proportions on the x-axis; both variables are summarised at county level. The figure suggests that there is no strong association between the two variables. The regression line in the plot, nevertheless, indicates that there may be a weak positive linear relationship: MMR vaccination rates may increase as the proportion of foreign-born population in the county increases. This, however, does not align with published research. Lu et al. (2014) found that people born outside the U.S. are at high risk of under-vaccination.
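
The strength and direction of such a relationship can be quantified with a correlation coefficient and a least-squares slope. The sketch below is a Python illustration on invented county-level values; it assumes the regression line in the figure is an ordinary least-squares fit, and the function name pearson_and_slope is hypothetical.

```python
def pearson_and_slope(x, y):
    """Return (Pearson r, least-squares slope of y on x) for paired values."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / (sxx * syy) ** 0.5
    slope = sxy / sxx
    return r, slope


# Invented county-level values: foreign-born proportion and MMR rate (%).
foreign_born = [0.02, 0.05, 0.08, 0.12, 0.17]
mmr_rate = [93.0, 95.5, 92.0, 96.0, 94.5]

r, slope = pearson_and_slope(foreign_born, mmr_rate)
print(f"r = {r:.2f}, slope = {slope:.2f}")
```

A positive slope with a small r, as in this invented example, is what a weak positive linear relationship looks like: the line tilts upward, but the points are widely scattered around it.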

3.6 Do Regions with Better Educational Attainment Have Higher MMR Vaccination Rates?

Figure 3.9: Stacked Bar Chart of MMR Vaccination Rates and Educational Attainment

This section aims to determine whether better educational attainment is associated with higher MMR vaccination rates. The educational attainment level is calculated by combining the proportions of high school graduates and bachelor’s degree graduates. MMR vaccination rates and educational attainment levels across states are shown in Figure 3.9. The figure shows that while Illinois has the highest MMR vaccination rate, its educational attainment level is middling compared to other states. Similarly, while Maine has the highest proportion of high school and bachelor’s degree graduates, its MMR vaccination rate is not the highest.

Figure 3.10: Relationship between MMR Vaccination Rates and Educational Attainment Level

Figure 3.10 shows that MMR vaccination rates and educational attainment level have a weak positive relationship; the value of 0.283 indicates a weak positive association. This suggests that while a better education level may be associated with higher MMR vaccination rates, it is not a strong indicator. Vaccination may have more to do with people’s awareness, as people with low levels of education also get vaccinated.

Mora and Trapero-Bertran (2018) studied the influence of education on access to childhood immunisation. They found that the greater the educational attainment level, the higher the probability of being vaccinated in the immunisation programme they examined. They also found an age profile for vaccinations: less educated parents visit their GPs more frequently for immunisations when their children are below the age of six, but the pattern reverses after that age. Hence, for children aged between six and 16, more educated parents are more likely to ensure their children are immunised. Likewise, systematic vaccinations are more likely for parents with a lower educational attainment level.

4 Conclusion

The increasing number of individuals who refuse vaccination has become a significant public health concern. Revealing the drivers behind this trend is imperative to prevent outbreaks of preventable diseases. This report examines whether measles vaccination rates increase as socioeconomic conditions improve. However, due to inconsistencies in the data collection behind the WSJ dataset, the conclusions reached by this report might not be entirely accurate.

The analysis has revealed that while the MMR vaccination rate is generally higher than the overall vaccination rate, MMR vaccination rates appear to have no clear association with socioeconomic conditions. The average vaccination rate in public schools is higher than in private schools, even though private schools’ tuition fees are generally higher. However, the difference between MMR and overall vaccination rates is similar in private and public schools.

Schools with low MMR vaccination rates are spread across states without an apparent pattern, with Arkansas having an abnormally high proportion of these schools. Moreover, there is no clear association between income and MMR vaccination rates at state level. At national level, however, there is a clearer association between income and MMR vaccination rates. Contrary to popular belief, MMR vaccination rates are higher in the lower income bracket and lower in the higher income bracket.

With regard to migration, the findings of this analysis do not align with existing studies. While it is difficult to determine the association between the proportion of foreign-born population and MMR vaccination rates, the regression line suggests a weak positive linear relationship. This is contrary to research suggesting that foreign-born individuals are at higher risk of under-vaccination.

Lastly, this analysis found that while educational attainment is a weak indicator, states with better educational attainment tend to have higher vaccination rates.

5 References

Barrabeig, I., Rovira, A., Rius, C., Muñoz, P., Soldevila, N., Batalla, J., & Domínguez, A. (2011). Effectiveness of measles vaccination for control of exposed children. The Pediatric Infectious Disease Journal, 30(1), 78–80.

C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC Florida, 2020.

Cockcroft, A., Usman, M. U., Nyamucherera, O. F., Emori, H., Duke, B., Umar, N. A., & Andersson, N. (2014). Why children are not vaccinated against measles: a cross-sectional study in two Nigerian States. Archives of Public Health = Archives Belges de Sante Publique, 72(1), 48.

H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

Original S code by Richard A. Becker, Allan R. Wilks. R version by Ray Brownrigg. Enhancements by Thomas P Minka and Alex Deckmyn. (2018). maps: Draw Geographical Maps. R package version 3.3.0. https://CRAN.R-project.org/package=maps

Phadke, V. K., Bednarczyk, R. A., Salmon, D. A., & Omer, S. B. (2016). Association Between Vaccine Refusal and Vaccine-Preventable Diseases in the United States: A Review of Measles and Pertussis. JAMA: The Journal of the American Medical Association, 315(11), 1149–1158.

Queensland Health. (2019, October 22). What is measles and why do we vaccinate against it? Retrieved 25 August 2020, from https://www.health.qld.gov.au/news-events/news/what-is-measles-why-vaccinate

Sinclair, D. R., Grefenstette, J. J., Krauland, M. G., Galloway, D. D., Frankeny, R. J., Travis, C., … Roberts, M. S. (2019). Forecasted Size of Measles Outbreaks Associated With Vaccination Exemptions for Schoolchildren. JAMA Network Open, 2(8), e199768.

Shaw, J., Tserenpuntsag, B., McNutt, L.-A., & Halsey, N. (2014). United States private schools have higher rates of exemptions to school immunization requirements than public schools. The Journal of Pediatrics, 165(1), 129–133.

Tim Appelhans, Florian Detsch, Christoph Reudenbach and Stefan Woellauer (2020). mapview: Interactive Viewing of Spatial Data in R. R package version 2.9.0. https://CRAN.R-project.org/package=mapview

Yihui Xie (2020). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.29.

Yihui Xie (2015) Dynamic Documents with R and knitr. 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963

Yihui Xie (2014) knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch and Roger D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595

Kassambara, Alboukadel. 2020. “’ggplot2’ Based Publication Ready Plots [R package ggpubr version 0.4.0].” Comprehensive R Archive Network (CRAN). https://cran.r-project.org/web/packages/ggpubr/index.html.

Kerr, Emma. 2019. “The Cost of Private Vs. Public Colleges.” U.S News, June.

Lambert, Diana, and Daniel J. Willis. 2019. “California Charter, Private Schools Report Lower Vaccination Rates Than Traditional Public Schools.” EdSource, August.

Lu, Peng-jun, Alfonso Rodriguez-Lainz, Alissa O’Halloran, Stacie Greby, and Walter W. Williams. 2014. “Adult vaccination disparities among foreign born populations in the United States, 2012.” Am. J. Prev. Med. 47 (6): 722. https://doi.org/10.1016/j.amepre.2014.08.009.

“Measles.” 2020. Australian Government Department of Health. https://www.health.gov.au/health-topics/measles#what-is-measles.

Mora, T., and M. Trapero-Bertran. 2018. “The influence of education on the access to childhood immunization: the case of Spain.” BMC Public Health 18. https://doi.org/10.1186/s12889-018-5810-1.

“National, State, and Selected Local Area Vaccination Coverage Among Children Aged 19–35 Months — United States, 2014.” 2015. https://www.cdc.gov/mmwr/preview/mmwrhtml/mm6433a1.htm.

Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” R Journal 10 (1): 439–46. https://journal.r-project.org/archive/2018/RJ-2018-009/index.html.

rfordatascience. 2020. “tidytuesday.” GitHub. https://github.com/rfordatascience/tidytuesday.

“R: The R Project for Statistical Computing.” 2020. https://www.r-project.org.

“Scale Functions for Visualization [R package scales version 1.1.1].” 2020. Comprehensive R Archive Network (CRAN). https://cran.r-project.org/web/packages/scales/index.html.

Slowikowski, Kamil. 2020. “Automatically Position Non-Overlapping Text Labels with.” Comprehensive R Archive Network (CRAN), March. https://cran.r-project.org/web/packages/ggrepel/index.html.

TaxFoundation. 2020. “facts-and-figures.” GitHub. https://github.com/TaxFoundation/facts-and-figures.

United States Census Bureau. 2018. “ACS 5-Year Data Profiles.”

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino Mcgowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Zhu, Hao. 2020. “Construct Complex Table with ’kable’ and Pipe Syntax [R package kableExtra version 1.2.1].” Comprehensive R Archive Network (CRAN). https://cran.r-project.org/web/packages/kableExtra/index.html.